Maximum Entropy Summary Trees
Authors
Howard Karloff, Kenneth E. Shirley
Abstract
Given a very large, node-weighted, rooted tree on, say, n nodes, if one has only enough space to display a k-node summary of the tree, what is the most informative way to draw the tree? We define a type of weighted tree that we call a summary tree of the original tree, which results from aggregating nodes of the original tree subject to certain constraints. We suggest that the best choice of summary tree (among those with a fixed number of nodes) is the one that maximizes the information-theoretic entropy of a natural probability distribution associated with the summary tree, and we provide a (pseudo-polynomial-time) dynamic-programming algorithm to compute this maximum entropy summary tree when the weights are integral. The result is an automated way to summarize large trees that retains as much information about them as possible while using (and displaying) only a fraction of the original node set. We illustrate the computation and use of maximum entropy summary trees on five real data sets whose weighted tree representations vary widely in structure. We also provide an additive approximation algorithm and a greedy heuristic that are faster than the optimal algorithm and that generalize to trees with real-valued weights.
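To make the objective concrete, here is a minimal Python sketch (with hypothetical weights; this is the scoring function, not the paper's dynamic-programming algorithm): each summary node's aggregated weight, divided by the total, gives a probability, and among same-size summaries the one with maximum Shannon entropy is preferred.

# A minimal sketch of the objective being maximized: a candidate
# summary tree induces a distribution over its k nodes, p_i = w_i / W,
# where w_i is the total original weight aggregated into summary node i.
from math import log2

def summary_entropy(aggregated_weights):
    """Entropy (in bits) of the distribution induced by a summary tree
    whose nodes carry the given aggregated (positive) weights."""
    total = sum(aggregated_weights)
    return -sum((w / total) * log2(w / total)
                for w in aggregated_weights if w > 0)

# Two hypothetical 3-node summaries of the same 100-unit tree:
balanced = [34, 33, 33]   # near-uniform aggregation
skewed   = [90, 5, 5]     # one "other" node swallows most of the weight
print(summary_entropy(balanced))  # ~1.585 bits, close to log2(3)
print(summary_entropy(skewed))    # ~0.569 bits, far less informative

Intuitively, a near-uniform aggregation spreads the display budget evenly over the tree's weight, which is why the entropy criterion favors it over a summary dominated by a single catch-all node.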
Similar references
Fast Algorithms for Constructing Maximum Entropy Summary Trees
Karloff and Shirley recently proposed "summary trees" as a new way to visualize large rooted trees (EuroVis 2013) and gave algorithms for generating a maximum-entropy k-node summary tree of an input n-node rooted tree. However, the algorithm generating optimal summary trees was only pseudo-polynomial (and worked only for integral weights); the authors left open the existence of a polynomial-time al...
Entropy based classification trees
One method for building classification trees is to choose split variables by maximising expected entropy. This can be extended through the application of imprecise probability by replacing instances of expected entropy with the maximum possible expected entropy over credal sets of probability distributions. Such methods may not take full advantage of the opportunities offered by imprecise proba...
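For illustration, a minimal Python sketch (with made-up labels; the credal-set extension is not shown) of the expected-entropy score that such split criteria are built on: each child's class entropy, weighted by the fraction of examples it receives.

# A minimal sketch of the expected-entropy score for a candidate split.
from collections import Counter
from math import log2

def class_entropy(labels):
    """Shannon entropy (bits) of the empirical class distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def expected_entropy(partition):
    """Weighted-average entropy over the child nodes of a split."""
    n = sum(len(child) for child in partition)
    return sum(len(child) / n * class_entropy(child) for child in partition)

# Hypothetical binary split of ten labeled examples:
left, right = ["a", "a", "a", "b"], ["b", "b", "b", "b", "a", "a"]
print(expected_entropy([left, right]))  # score for this candidate split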
Maximum Entropy modeling for MWE identification: comparison with decision trees
This report begins with a description of the maximum entropy framework commonly used in modeling natural language processing tasks. To the best of our knowledge, this represents a first attempt to model multiword expression identification using a maximum entropy model. A related class of models, log-linear models, has previously been used for automated identification of phrasal verbs (e.g., look s...
Context-dependent acoustic modeling based on hidden maximum entropy model for statistical parametric speech synthesis
Decision-tree-clustered context-dependent hidden semi-Markov models (HSMMs) are typically used in statistical parametric speech synthesis to represent probability densities of acoustic features given contextual factors. This paper addresses three major limitations of this decision-tree-based structure: i) the decision tree structure lacks adequate context generalization; ii) it is unable to express complex context dependencies; iii) parameter...
Understanding Privacy Risk of Publishing Decision Trees
Publishing decision trees can provide enormous benefits to society. At the same time, it is widely believed that publishing decision trees can pose a potential risk to privacy. However, the privacy consequences of publishing decision trees have not been investigated much. To understand this problem, we need to quantitatively measure privacy risk. Based on the well-established maximum entropy the...
Journal: Comput. Graph. Forum
Volume: 32
Pages: -
Publication year: 2013